Faster Algorithms for Privately Releasing Marginals
Authors
Abstract
We study the problem of releasing k-way marginals of a database D ∈ ({0, 1}^d)^n, while preserving differential privacy. The answer to a k-way marginal query is the fraction of D’s records x ∈ {0, 1}^d with a given value in each of a given set of up to k columns. Marginal queries enable a rich class of statistical analyses of a dataset, and designing efficient algorithms for privately releasing marginal queries has been identified as an important open problem in private data analysis (cf. Barak et al., PODS ’07). We give an algorithm that runs in time d^{O(√k)} and releases a private summary capable of answering any k-way marginal query with at most ±0.01 error on every query as long as n ≥ d^{O(√k)}. To our knowledge, ours is the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of k-way marginal queries, which is d^{Θ(k)} (for k ≪ d).
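The definition of a k-way marginal query in the abstract can be illustrated with a short Python sketch. It is not the paper's algorithm and adds no privacy protection; it only evaluates a marginal exactly on a toy database and counts how many such queries exist. The function names `marginal` and `count_marginal_queries`, the tuple-of-bits representation, and the choice to count only non-empty column sets are assumptions made for illustration.

```python
from math import comb


def marginal(db, columns, values):
    """Exact (non-private) answer to a marginal query.

    db      : list of records, each a tuple of d bits (hypothetical layout)
    columns : indices of the queried columns (at most k of them)
    values  : the required bit for each queried column
    Returns the fraction of records that match on every queried column.
    """
    if len(columns) != len(values):
        raise ValueError("columns and values must have the same length")
    if not db:
        raise ValueError("database must contain at least one record")
    matches = sum(
        all(row[c] == v for c, v in zip(columns, values)) for row in db
    )
    return matches / len(db)


def count_marginal_queries(d, k):
    """Number of marginal queries on 1..k of d binary columns.

    Choosing j columns and a bit for each gives C(d, j) * 2^j queries
    (assumption: the empty column set is not counted).
    """
    return sum(comb(d, j) * 2**j for j in range(1, k + 1))


# Toy database with n = 4 records and d = 3 binary columns.
db = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 0, 0)]
print(marginal(db, (0, 2), (1, 1)))   # 2-way marginal: column 0 = 1 and column 2 = 1
print(count_marginal_queries(4, 2))   # queries on up to 2 of 4 columns
```

The count grows as d^{Θ(k)} for k ≪ d, which is why answering every query one by one is the running-time baseline the abstract compares against.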
Similar resources
Faster Algorithms for Privately Releasing Marginals (25 Jun 2012)
We study the problem of releasing k-way marginals of a database D ∈ ({0, 1}^d)^n, while preserving differential privacy. The answer to a k-way marginal query is the fraction of D’s records x ∈ {0, 1}^d with a given value in each of a given set of up to k columns. Marginal queries enable a rich class of statistical analyses of a dataset, and designing efficient algorithms for privately releasing ma...
Full text
Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations
Consider a database of n people, each represented by a bit-string of length d corresponding to the setting of d binary attributes. A k-way marginal query is specified by a subset S of k attributes, and a |S|-dimensional binary vector β specifying their values. The result for this query is a count of the number of people in the database whose attribute vector restricted to S agrees with β. Priva...
Full text
Marginal Release Under Local Differential Privacy
Many analysis and machine learning tasks require the availability of marginal statistics on multidimensional datasets while providing strong privacy guarantees for the data subjects. Applications for these statistics range from finding correlations in the data to fitting sophisticated prediction models. In this paper, we provide a set of algorithms for materializing marginal statistics under th...
Full text
Faster Private Release of Marginals on Small Databases (13 Apr 2013)
We study the problem of answering k-way marginal queries on a database D ∈ ({0, 1}^d)^n, while preserving differential privacy. The answer to a k-way marginal query is the fraction of the database’s records x ∈ {0, 1}^d with a given value in each of a given set of up to k columns. Marginal queries enable a rich class of statistical analyses on a dataset, and designing efficient algorithms for priv...
Full text
Efficient graphical models for sequence segmentation
Segmentation of sequences is an important modeling primitive with several applications. Training and inference of segmentation models involve dynamic programming computations that in the worst case can be cubic in the length of a sequence. In contrast, typical sequence labeling models require linear time. We propose an alternative graphical model for efficient sharing of potentials across over...
Full text